# Code for ICLR 2023

## Requirements

### Environment:

- Python 3.8.5
- Ubuntu 20.04

### Setup:
```
# Create and activate a Python environment (optional)
conda create -n pyt1.11 python=3.8.5
conda activate pyt1.11

# Install PyTorch with CUDA (optional)
conda install pytorch==1.11 torchvision torchaudio cudatoolkit=11.3 -c pytorch

# Install Python dependencies
pip install -r requirements.txt
```
### Data
The `*_data` folder contains randomly sampled data; we will release the full dataset after the paper is published. It holds three files, `train.json`, `valid.json`, and `test.json`, used for training, validation, and testing respectively. Each file contains multiple lines, and each line represents one instance. The schema for each instance is listed below:
```
{
    "title":        #   goal of the activity
    "method":       #   subgoal of the activity
    "steps":        #   list of step texts
    "captions":     #   list of captions corresponding to the steps
    "target":       #   next step text
    "img":          #   last step image id
    "target_img":   #   next step image id
    "retrieve":     #   20 retrieved relevant historical steps
    "retrieve_neg": #   top-20 retrieved steps most similar to the last step; used as retrieve-negatives
}
```
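
As a minimal sketch of reading these files, the snippet below assumes each line is a standalone JSON object (JSON Lines layout) carrying the keys from the schema above. The helper name `load_instances` and the constant `EXPECTED_KEYS` are hypothetical and not part of this repository; the key check is only an illustrative sanity test.

```python
import json
from typing import Any, Dict, List

# Key names copied from the schema above; value types are not checked here.
EXPECTED_KEYS = {
    "title", "method", "steps", "captions", "target",
    "img", "target_img", "retrieve", "retrieve_neg",
}


def load_instances(path: str) -> List[Dict[str, Any]]:
    """Read one JSON object per non-empty line (assumed JSON Lines layout)."""
    instances = []
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            instance = json.loads(line)
            # Fail early if a line lacks any of the schema keys.
            missing = EXPECTED_KEYS - set(instance)
            if missing:
                raise ValueError(
                    f"{path}:{line_no} is missing keys: {sorted(missing)}"
                )
            instances.append(instance)
    return instances
```

Calling `load_instances` with the path to a `train.json` file would return a list of dictionaries, one per instance, so that `instances[0]["steps"]` gives the step texts of the first instance.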


### Finetuning
To finetune the model, run `finetune_stp_contrastive.sh` from this folder:
```
bash finetune_stp_contrastive.sh 
```